Executing linear algebra kernels in heterogeneous distributed infrastructures with PyCOMPSs
Similar resources
Code Generation of Optimized Distributed-Memory Dense Linear Algebra Kernels
Design by Transformation (DxT) is an approach to software development that encodes domain-specific programs as graphs and expert design knowledge as graph transformations. The goal of DxT is to mechanize the generation of highly optimized code. This paper demonstrates how DxT can be used to transform sequential specifications of an important set of Dense Linear Algebra (DLA) kernels, the level-...
Accelerating GPU Kernels for Dense Linear Algebra
Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are a major building block of dense linear algebra (DLA) libraries, and therefore have to be highly optimized. We present some techniques and implementations that significantly accelerate the corresponding routines from currently available libraries for GPUs. In particular, Pointer Redirecting – a set of GPU-specific optimiz...
Load Balancing Strategies for Dense Linear Algebra Kernels on Heterogeneous Two-Dimensional Grids
Data Allocation Strategies for Dense Linear Algebra Kernels on Heterogeneous Two-dimensional Grids
We study the implementation of dense linear algebra computations, such as matrix multiplication and linear system solvers, on two-dimensional (2D) grids of heterogeneous processors. For these operations, 2D-grids are the key to scalability and efficiency. The uniform block-cyclic data distribution scheme commonly used for homogeneous collections of processors limits the performance of these opera...
Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach
While the first two involve fundamental physical limitations that current technology trends are unlikely to overcome in the near term, the third is an obvious consequence of the first two, combined with the economic necessity of using many thousands of computational units to scale up to petascale and larger systems. More transistors and slower clocks require multicore designs and an increased p...
Journal
Journal title: Oil & Gas Science and Technology – Revue d’IFP Energies nouvelles
Year: 2018
ISSN: 1294-4475, 1953-8189
DOI: 10.2516/ogst/2018047